







MASSACHUSETTS INSTITUTE OF TECHNOLOGY

VLSI PUBLICATIONS

AD-A195 150

VLSI Memo No. 88-436  
February 1988


## FAST ON-CHIP DELAY ESTIMATION FOR CELL-BASED EMITTER COUPLED LOGIC

Peter R. O'Brien, John L. Wyatt, Jr., Thomas L. Savarino, and James M. Pierce

### Abstract

The goal of this work is to produce fast, but accurate, estimates of best and worst case delay for on-chip emitter coupled logic (ECL) nets. The work consists of two major parts: 1) macromodelling of ECL logic gates acting as both sources and loads; and 2) delay estimation for individual nets using the gate macromodel parameters and RC tree models for metal interconnect. Both of the above functions (gate macromodelling and delay estimation) have been extensively tested on an industrial ECL process and cell (i.e., logic gate) library.

The success of a macromodelling approach relies on repetitive use of members of a library of modelled cells. A "fixed" computational cost (several c.p.u. hours per cell) is paid to obtain simplified macromodel parameter values. Resultant timing estimates are typically within 5-10% of SPICE and are obtained roughly three orders of magnitude more quickly than SPICE.

**DISTRIBUTION STATEMENT A**  
Approved for public release  
Distribution Unlimited

Microsystems  
Research Center  
Room 39-321

Massachusetts  
Institute  
of Technology

Cambridge  
Massachusetts  
02139

Telephone  
(617) 253-8138




#### Acknowledgements

This work was supported in part by the Digital Equipment Corporation, the National Science Foundation under Grant No. ECS83-10941, and the Defense Advanced Research Projects Agency under Contracts No. N00014-80-C-0622 and N00014-87-K-0825.

#### Author Information

O'Brien and Pierce: Digital Equipment Corporation, Marlborough, MA 01752; Wyatt: Department of Electrical Engineering and Computer Science, MIT, Room 36-864, Cambridge, MA 02139, (617) 253-6718; Savarino: Tangent Systems Corporation, Santa Clara, CA 95051.

Copyright© 1988 MIT. Memos in this series are for use inside MIT and are not considered to be published merely by virtue of appearing in this series. This copy is for private circulation only and may not be further copied or distributed, except for government purposes, if the paper acknowledges U. S. Government sponsorship. References to this work should be either to the published version, if any, or in the form "private communication." For information about the ideas expressed herein, contact the author directly. For information about this series, contact Microsystems Research Center, Room 39-321, MIT, Cambridge, MA 02139; (617) 253-8138.

# FAST ON-CHIP DELAY ESTIMATION FOR CELL-BASED EMITTER COUPLED LOGIC

Peter R. O'Brien, John L. Wyatt, Jr., Thomas L. Savarino, James M. Pierce

O'Brien, Pierce: Digital Equipment Corporation, Marlborough, MA 01752

Wyatt: Massachusetts Institute of Technology, Cambridge, MA 02139

Savarino: Tangent Systems Corporation, Santa Clara, CA 95051

## ABSTRACT

The goal of this work is to produce fast, but accurate, estimates of best and worst case delay for on-chip emitter coupled logic (ECL) nets. The work consists of two major parts: 1) macromodelling of ECL logic gates acting as both sources and loads; and 2) delay estimation for individual nets using the gate macromodel parameters and RC tree models for metal interconnect. Both of the above functions (gate macromodelling and delay estimation) have been extensively tested on an industrial ECL process and cell (i.e., logic gate) library.

The success of a macromodelling approach relies on repetitive use of members of a library of modelled cells. A "fixed" computational cost (several c.p.u. hours per cell) is paid to obtain simplified macromodel parameter values. Resultant timing estimates are typically within 5-10% of SPICE [1] and are obtained roughly three orders of magnitude more quickly than SPICE.

## I. INTRODUCTION

### Definition of terms

The definition of "metal delay" can best be illustrated by a simplified interconnect net with no branching and only one load gate [Fig. 1]. Let $T_{AB}(0)$ represent delay through the *unloaded* source gate. Let $T_{AB}(L)$ represent delay through the source gate *loaded* by an interconnect net of length $L$, as shown in Fig. 1. For all gates in our cell library, $T_{AB}(0)$ is known. What our algorithm estimates is "metal delay" defined by:

Figure 1: Simplified interconnect net.

$$\begin{aligned} D_{\text{metal}} &\triangleq T_{AC} - T_{AB}(0) \\ &= [T_{AB}(L) - T_{AB}(0)] + T_{BC}. \end{aligned} \quad (1)$$

So, "metal delay" has two distinct components: *extra delay through the source gate* caused by the loading of the source gate by metal, and *propagation delay* through the metal itself. Worst case (or "slow") metal delay is simply the definition in (1) evaluated using slow SPICE process parameters for the logic gates and metal interconnect, the maximum expected input risetime at point A, and a slow target voltage threshold at points B and C. The slow target voltage thresholds for rising and falling transitions are chosen to be, respectively:

$$V_{T,\text{slow,rise}} \triangleq \left( \frac{V_{LOW} + V_{HIGH}}{2} \right) + V_{\text{noise margin}} \quad (2)$$

$$V_{T,\text{slow,fall}} \triangleq \left( \frac{V_{LOW} + V_{HIGH}}{2} \right) - V_{\text{noise margin}}, \quad (3)$$

for some  $V_{\text{noise margin}} > 0$ . The definition of best case (or "fast") metal delay is made in a similar way with fast versions of SPICE parameters, input risetime, and output voltage threshold.
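The definitions in (1) through (3) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the class and function names are hypothetical, and the voltage and delay values in the usage note are made-up placeholders rather than data from the process described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicLevels:
    """Hypothetical container for the voltages used in Eqs. (2) and (3)."""
    v_low: float            # logic-low voltage (V)
    v_high: float           # logic-high voltage (V)
    v_noise_margin: float   # must be > 0, as required in the text


def slow_thresholds(levels):
    """Return (rise, fall) worst-case target thresholds per Eqs. (2) and (3)."""
    if levels.v_noise_margin <= 0:
        raise ValueError("noise margin must be positive")
    midpoint = 0.5 * (levels.v_low + levels.v_high)
    return midpoint + levels.v_noise_margin, midpoint - levels.v_noise_margin


def metal_delay(t_ab_loaded, t_ab_unloaded, t_bc):
    """Eq. (1): extra source gate delay plus propagation delay through metal."""
    extra_source_gate_delay = t_ab_loaded - t_ab_unloaded
    return extra_source_gate_delay + t_bc
```

For example, with placeholder levels `LogicLevels(-1.7, -0.9, 0.1)` the slow thresholds sit 0.1 V above and below the midpoint of -1.3 V, and `metal_delay(0.30, 0.25, 0.02)` adds a 0.05 extra source gate delay to a 0.02 propagation delay (in whatever time unit the inputs use).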

### Gate delay vs. interconnect delay

Previous work on waveform bounding and estimation for RC tree networks [2,3,4], with application to MOS circuits, has focused on the propagation component ( $T_{BC}$ ) of interconnect delay. Relatively simple models are used for the logic gates themselves. More recently, detailed macromodels for MOS logic gates [5] have been developed and used in conjunction with the RC tree delay estimation results. Good correlation with SPICE is obtained by fitting macromodel parameters to selected SPICE experiments. We develop macromodels specifically for ECL gates at a level of detail similar to [5]. This fills a definite need since most reported work in this area has been for MOS circuits, even though bipolar digital circuits are presently in wide use for high-performance applications. Recent work that has been reported for bipolar circuits [6,7,8] is concerned mainly with logic simulation, and the timing models used are relatively simple.

To emphasize the importance of accurately modelling the source gate (as opposed to just the interconnect), in Fig. 2 we plot separately the two components of (rising transition, worst case) metal delay versus total load net capacitance for a uniformly distributed metal line in the simplified topology of Fig. 1. The gate used as both the source and the load is a standard 2-input OR/NOR. We denote the total load net capacitance by "$C_{\text{NET}}$," and the maximum expected value of $C_{\text{NET}}$ by "$C_{\text{MAX}}$." For low values of $C_{\text{NET}}$, extra source gate delay dominates propagation delay: the two become equal only when $C_{\text{NET}}/C_{\text{MAX}} \approx 0.51$. Furthermore, the statistical distribution of $C_{\text{NET}}$ is not uniform across $[0, C_{\text{MAX}}]$, but rather is skewed towards lower capacitance values. In fact, for our designs, 90% of the load nets have $C_{\text{NET}}/C_{\text{MAX}}$ values under 0.25, where propagation delay is only 42% of extra source gate delay. In addition, for a falling transition, extra source gate delay is even more important than shown in Fig. 2, typically exceeding propagation delay throughout the entire range $0 \leq C_{\text{NET}} \leq C_{\text{MAX}}$. Therefore, extra source gate delay is typically the dominant component of metal delay.

Figure 2: Two components of $D_{\text{metal}}$ vs. total load net capacitance.

## II. LOGIC GATE MACROMODELLING

### Load modelling

Modelling of ECL gates as *loads* is very simple. As in MOS, a single linear lumped capacitor is an acceptable load model. Modelling of leakage current in ECL loads does not appear to be necessary. For our process, the worst case (maximum expected metal resistance, maximum expected fanout) voltage drop through metal due to steady state leakage current is less than 3% of the rail-to-rail voltage swing. In situations where leakage current might be modelled (e.g., *clock* nets with very high fanout), this could be done by adding a linear resistor and a d.c. current source in parallel with the capacitor [9,10]. The capacitance value is extracted from SPICE simulations of transient voltage at the load and current into the load. Since on-chip metal is modelled as a linear RC tree, and load gates as simple linear capacitors, this means an entire load net (metal and loads) is modelled as a linear RC tree.

### Source modelling

Modelling of ECL gates as *sources* is more difficult. Because of an emitter-follower output stage, source gate behavior exhibits a fundamental asymmetry between rising and falling transitions. For a falling transition, there is a limitation on transient sinking current as  $C_{\text{NET}}$  increases. We use two different source model types: the first one for falling transitions when  $C_{\text{NET}} \geq C_{\text{THRESH}}$  (to model the transient sinking current limitation), and the second one for falling transitions when  $C_{\text{NET}} < C_{\text{THRESH}}$  and for all rising transitions. The "threshold" capacitance ( $C_{\text{THRESH}}$ ) is determined from SPICE simulations of transient source gate output current ( $I_{\text{out}}$ ) during falling transitions.  $C_{\text{THRESH}}$  is defined to be the value of  $C_{\text{NET}}$  where the sensitivity of  $|I_{\text{out}}|_{\text{max}}$  to a perturbation in  $C_{\text{NET}}$  drops below a pre-determined value.

To model the sinking current limitation, the first source gate model is simply a delayed current source pulse. The model delay before the onset of the current pulse and the magnitude of the pulse ( $I_{\text{SAT}}$ ) are extracted from the same SPICE simulations used to determine  $C_{\text{THRESH}}$ . The duration of the pulse is exactly long enough to sink the correct amount of charge:

$$\text{pulse duration} \triangleq \frac{C_{\text{NET}} (V_{\text{HIGH}} - V_{\text{LOW}})}{I_{\text{SAT}}} \quad (4)$$
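As a rough illustration of the model selection rule and of (4), the following Python sketch picks the source model type and computes the pulse duration. The function names and string labels are hypothetical, and the numerical values in the usage note are placeholders, not extracted macromodel parameters.

```python
def source_model_type(transition, c_net, c_thresh):
    """Select the source model type as described in the text.

    The current-pulse model applies only to falling transitions with
    C_NET >= C_THRESH; every other case uses the d.c.-drive-curve model.
    """
    if transition not in ("rising", "falling"):
        raise ValueError("transition must be 'rising' or 'falling'")
    if transition == "falling" and c_net >= c_thresh:
        return "current_pulse"
    return "dc_drive_curve"


def pulse_duration(c_net, v_high, v_low, i_sat):
    """Eq. (4): time for a constant current i_sat to sink C_NET * (V_HIGH - V_LOW)."""
    if i_sat <= 0:
        raise ValueError("i_sat must be positive")
    return c_net * (v_high - v_low) / i_sat
```

With a placeholder net of 2 pF, a 0.8 V swing, and a 1 mA saturation current, `pulse_duration(2e-12, -0.9, -1.7, 1e-3)` gives about 1.6 ns, which is simply the charge $C_{\text{NET}}(V_{\text{HIGH}} - V_{\text{LOW}})$ divided by $I_{\text{SAT}}$.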

The second source gate model is based on the source gate's *d.c. drive curves*, which show the static output voltage vs. input voltage relationship for different values of output load current. To describe a family of three-segment piecewise-linear approximations to the d.c. drive curves, four "d.c. parameters" are obtained by curve fitting to d.c. SPICE output (see also [3,5]). These d.c. parameters alone are sufficient to model the source gate's response to *slow* inputs, when the gate behaves quasi-statically. However, an ECL input is usually too fast for the source gate to respond quasi-statically. The source gate responds somewhat more slowly than the quasi-static model alone would predict. So, four additional "dynamic parameters" are extracted from SPICE data of transient source gate responses in order to model, as a function of  $C_{\text{NET}}$ , the departure from a purely quasi-static response.

Each of the two source gate models is used in conjunction with an approximation to the driving-point admittance of the load net given by a single lumped RC segment ($R_{NET}$, $C_{NET}$). Based on values for $R_{NET}$, $C_{NET}$, the source gate macromodel parameters, and the input risetime, a *closed-form analytical* expression for the model waveform $v_B(t)$ is obtained. The detailed model expressions are omitted here for brevity, but can be found in [11].

### Macromodel parameter summary

A total of  $2(lev)$  load gate macromodel parameter values (capacitance) are extracted for each cell, where  $lev$  denotes the number of input levels of the cell being modelled. (Note: the term "input level" refers to a subset of the individual input signals that affect the current-steering logic through the same number of base-emitter junction voltage drops.) Capacitance values are obtained for each combination of slow/fast SPICE parameters and different input level.

A total of 76 source gate macromodel parameter values are extracted for each cell: 12 for the first source model (i.e., falling transition and  $C_{NET} \geq C_{THRESH}$ ), and 64 for the second source model. For the first model: 3 parameters ( $C_{THRESH}$ ,  $I_{SAT}$ , and current pulse delay) for each combination of slow/fast SPICE parameters and "true/complement" output side of the cell. For the second model: 8 parameters (4 "d.c." and 4 "dynamic") for each combination of slow/fast SPICE parameters, "true/complement" output side of the cell, and rising/falling output transition.
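The parameter counts above follow from enumerating the combinations named in the text. The short Python sketch below does that enumeration; the parameter labels such as `dc_1` and `dyn_1` are placeholder names chosen for illustration, since the individual d.c. and dynamic parameters are not named here.

```python
from itertools import product

CORNERS = ("slow", "fast")              # SPICE parameter sets
OUTPUT_SIDES = ("true", "complement")   # output side of the cell
TRANSITIONS = ("rising", "falling")     # output transition direction

# First source model: falling transition with C_NET >= C_THRESH.
FIRST_MODEL_PARAMS = ("C_THRESH", "I_SAT", "pulse_delay")
# Second source model: 4 "d.c." and 4 "dynamic" parameters (placeholder names).
SECOND_MODEL_PARAMS = tuple(f"dc_{i}" for i in range(1, 5)) + tuple(
    f"dyn_{i}" for i in range(1, 5)
)

first_model_keys = list(product(CORNERS, OUTPUT_SIDES, FIRST_MODEL_PARAMS))
second_model_keys = list(
    product(CORNERS, OUTPUT_SIDES, TRANSITIONS, SECOND_MODEL_PARAMS)
)


def load_parameter_count(lev):
    """Capacitance values per cell: one per (slow/fast, input level) pair."""
    return len(CORNERS) * lev


source_parameter_count = len(first_model_keys) + len(second_model_keys)
```

Here `len(first_model_keys)` is 3 × 2 × 2 = 12 and `len(second_model_keys)` is 8 × 2 × 2 × 2 = 64, which together give the 76 source gate parameters per cell.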

## III. REDUCED-ORDER INTERCONNECT MODELS

### Driving-point admittance approximation

As mentioned in the previous section, to enable the computation of an analytical expression for the model waveform  $v_B(t)$ , the driving-point admittance of the load net is approximated by a single lumped RC segment ( $R_{NET}, C_{NET}$ ). The values of  $R_{NET}$  and  $C_{NET}$  are chosen to match the first two terms of the Taylor series expansion around  $s = 0$  of the driving-point admittance of the given load net,

$$Y_{LOAD\ NET}(s) = \sum_{n=1}^{\infty} y_n s^n, \quad (5)$$

where the series representation is valid only within some circle of convergence  $|s| < r_{conv}$ . Our approximate driving-point admittance is:

$$\begin{aligned} Y_{APPROX}(s) &= \frac{sC_{NET}}{1 + sR_{NET}C_{NET}} \\ &= \sum_{n=1}^{\infty} (-1)^{n-1} R_{NET}^{n-1} C_{NET}^n s^n, \end{aligned} \quad (6)$$

where the second equality is valid only within the circle of convergence  $|s| < 1/R_{NET}C_{NET}$ . So to match both the  $s$  and  $s^2$  terms, we choose:

$$C_{NET} = y_1 \quad (7)$$

$$R_{NET} = -y_2/y_1^2. \quad (8)$$

The approximation is computed quickly using an algorithm [11] which allows the computation of  $y_1$  and  $y_2$  (and, hence, of  $R_{NET}$  and  $C_{NET}$ ) to proceed sequentially upstream from the leaf nodes of the load net until the source gate output is finally reached. The low-frequency nature of this approximation, implicit in the use of a Taylor series expansion around  $s = 0$ , turns out to be entirely justified. For our process, most of the frequency content of a typical source gate output waveform lies well inside the circle of convergence for both the true and approximate driving-point admittance [11].
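One way to carry out such a leaf-to-source computation is sketched below in Python. This is not necessarily the algorithm of [11]; it is a standard recursion that assumes the load net is an RC tree with a grounded capacitor at each node and a series resistor on each branch. The `Node` class and function names are hypothetical. The recursion relies on two facts: admittances of parallel branches add, and placing a series resistor $R$ in front of an admittance $Y(s) = c_1 s + c_2 s^2 + \dots$ gives $Y/(1 + RY) = c_1 s + (c_2 - R c_1^2) s^2 + \dots$.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical RC tree node: a grounded capacitor plus downstream branches."""
    cap: float                                     # capacitance to ground (F)
    branches: list = field(default_factory=list)   # (series resistance, child Node)


def admittance_moments(node):
    """Return (y1, y2), the s and s^2 coefficients of the admittance into `node`."""
    y1, y2 = node.cap, 0.0                 # a grounded capacitor contributes s*C only
    for resistance, child in node.branches:
        c1, c2 = admittance_moments(child)  # evaluate the downstream subtree first
        # Series resistor: Y/(1 + R*Y) = c1*s + (c2 - R*c1**2)*s**2 + ...
        y1 += c1
        y2 += c2 - resistance * c1 * c1
    return y1, y2


def lumped_rc(node):
    """Return (R_NET, C_NET) chosen per Eqs. (7) and (8)."""
    y1, y2 = admittance_moments(node)
    if y1 <= 0.0:
        raise ValueError("load net has no capacitance")
    return -y2 / (y1 * y1), y1
```

As a sanity check, a two-segment ladder with unit resistors and unit capacitors, `Node(0.0, [(1.0, Node(1.0, [(1.0, Node(1.0))]))])`, yields $C_{NET} = 2$ and $R_{NET} = 1.25$, while a single series resistor feeding a single capacitor returns exactly that resistor and capacitor.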

### Voltage transfer function approximation

We propagate the model source gate output voltage waveform ( $v_B(t)$ ) downstream to the load(s) of interest by convolving with an approximate unit voltage impulse response first developed by Horowitz [3]. We use an approximate impulse response because obtaining the precise impulse response of a general RC tree is too computationally expensive. In addition, closed-form analytical expressions are available for the approximate impulse response. This allows, via convolution with the model source gate output voltage waveform, computation of closed-form analytical expressions for the model voltage waveform(s) at the load(s) ( $v_C(t)$ ). The model voltage waveform at each load is then numerically inverted, at the appropriate voltage threshold, in order to compute the metal delay to that load.

Let  $h(t)$  and  $h_{approx}(t)$  denote, respectively, the true and approximate unit voltage impulse response at a given load. Let  $H(s)$  and  $H_{approx}(s)$  denote, respectively at the same load, the Laplace transform of the true and approximate impulse response. The approximate transfer function has two poles and one zero:

$$H_{approx}(s) = \frac{1 + s\tau_z}{(1 + s\tau_1)(1 + s\tau_2)}. \quad (9)$$

The time constants $\tau_z$, $\tau_1$, and $\tau_2$ are determined by the following three constraints:

$$\int_0^{\infty} t h_{approx}(t) dt = \int_0^{\infty} t h(t) dt \quad (10)$$

$$\int_0^{\infty} t^2 h_{approx}(t) dt = \int_0^{\infty} t^2 h(t) dt \quad (11)$$

$$\frac{1 + b_1 s + b_2 s^2 + \dots}{1 + (\tau_1 + \tau_2) s + a_2 s^2 + \dots} = H(s). \quad (12)$$
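To illustrate how a closed-form waveform from a two-pole, one-zero transfer function can be numerically inverted at a voltage threshold, here is a minimal Python sketch. It makes several simplifying assumptions that are not from the text: the input is an ideal unit step rather than the model waveform $v_B(t)$, the zero time constant is called `tau_z`, the two pole time constants are distinct, the response rises monotonically, and the threshold is normalized to the interval (0, 1). Bisection is used here as one possible inversion method.

```python
import math


def step_response(t, tau_z, tau_1, tau_2):
    """Unit-step response of H(s) = (1 + s*tau_z) / ((1 + s*tau_1)(1 + s*tau_2))."""
    if tau_1 == tau_2:
        raise ValueError("this closed form assumes distinct pole time constants")
    # Partial-fraction weights of the two decaying exponentials; they sum to 1,
    # so the response starts at 0 and settles at 1.
    k1 = (tau_1 - tau_z) / (tau_1 - tau_2)
    k2 = (tau_2 - tau_z) / (tau_2 - tau_1)
    return 1.0 - k1 * math.exp(-t / tau_1) - k2 * math.exp(-t / tau_2)


def crossing_time(threshold, tau_z, tau_1, tau_2, rel_tol=1e-12):
    """Bisection for the time at which the step response reaches `threshold`.

    Assumes a monotonically rising response and 0 < threshold < 1.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be normalized to (0, 1)")
    lo, hi = 0.0, max(tau_1, tau_2)
    while step_response(hi, tau_z, tau_1, tau_2) < threshold:
        hi *= 2.0                       # expand the bracket until it contains the crossing
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if step_response(mid, tau_z, tau_1, tau_2) < threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When the zero cancels one pole (for instance `tau_z == tau_2`), the response reduces to a single exponential and the 50% crossing time is $\tau_1 \ln 2$, which gives a convenient check of the routine.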

## IV. RESULTS



Figure 3: Rising transition comparison.



Figure 4: Falling transition comparison.

In Figs. 3 and 4, we show comparisons of SPICE vs. our algorithm. In Fig. 3, we use the same SPICE data shown in Fig. 2. In Fig. 4, we use the same gate (2-input OR/NOR) and the same net topology (uniform unbranched line with a single load), but we examine a falling transition. Two points to note about Fig. 4 are:

1. the boundary between the two different source model types is  $C_{NET} = C_{THRESH} = 0.43C_{MAX}$  for this particular gate; and
2. the two  $T_{BC}$  curves are nearly indistinguishable on the time scale of the plot.

Assuming that gate macromodel parameters have been obtained in advance, the computation speed-up relative to SPICE is approximately three orders of magnitude. Similar accuracy and speed-up results are obtained using a wide variety of different logic cells and non-uniform branched net topologies [11].

## ACKNOWLEDGMENTS

This work was supported by: the Digital Equipment Corporation, the National Science Foundation under Grant No. ECS83-10941, and the Defense Advanced Research Projects Agency under Contracts No. N00014-80-C-0622 and N00014-87-K-0825.

## REFERENCES

- [1] L.W. Nagel, *SPICE2: A Computer Program to Simulate Semiconductor Circuits*, Memo ERL-M520, University of California, Berkeley, May 1975.
- [2] J. Rubinstein, P. Penfield, Jr., and M.A. Horowitz, "Signal Delay in RC Tree Networks," *IEEE Trans. Computer-Aided Design*, vol. CAD-2, no. 3, pp. 202-211, July 1983.
- [3] M.A. Horowitz, *Timing Models for MOS Circuits*, Ph.D. Thesis, Stanford University, 1983.
- [4] T. Lin and C.A. Mead, "Signal Delay in General RC Networks," *IEEE Trans. Computer-Aided Design*, vol. CAD-3, no. 4, pp. 331-349, October 1984.
- [5] M.D. Matson, *Macromodelling and Optimization of Digital MOS VLSI Circuits*, Ph.D. Thesis, Massachusetts Institute of Technology, 1985.
- [6] P. Kozak, A.K. Bose, and A. Gupta, "Design Aids for the Simulation of Bipolar Gate Arrays," *Proc. 20th Design Automation Conference*, Miami Beach, FL, pp. 286-292, June 1983.
- [7] I.N. Hajj, D. Saab, and B. Rosario, "Logic and Timing Simulation of Bipolar ECL Circuits," *Proc. 1984 IEEE Int. Conference on Computer-Aided Design*, Santa Clara, CA, pp. 194-196, November 1984.
- [8] I.N. Hajj and D. Saab, "Switch-Level Logic Simulation of Digital Bipolar Circuits," *IEEE Trans. Computer-Aided Design*, vol. CAD-6, no. 2, pp. 251-258, March 1987.
- [9] P. O'Brien and J.L. Wyatt, Jr., *Signal Delay in Leaky RC Mesh Models for Bipolar Interconnect*, M.I.T. VLSI Memo 85-278, November 1985. All VLSI Memos are available from the Microsystems Research Center, Room 39-321, M.I.T., Cambridge, MA 02139.
- [10] P. O'Brien and J.L. Wyatt, Jr., "Signal Delay in ECL Interconnect," *Proc. 1986 IEEE Int. Symp. on Circuits and Systems*, pp. 755-758, San Jose, CA, May 1986.
- [11] P. O'Brien and J.L. Wyatt, Jr., *Fast On-Chip Delay Estimation for Cell-Based Emitter Coupled Logic*. To appear in M.I.T. VLSI Memo Series in 1988. All VLSI Memos are available from the Microsystems Research Center, Room 39-321, M.I.T., Cambridge, MA 02139.
